Comparing Embeddings Based Search Methods and BM25 Results

Bert Staub • Location: Theater 4 • Haystack 2020

Improvements to keyword search still provide utility, but at increasing development complexity and deployment cost. In other words, the industry is moving toward the far right of the curve of diminishing returns. At LexisNexis, we have started to explore embeddings-based search using bag-of-words and bidirectional methods. In this session, we will share our approach and compare the value and cost of vectorization-based search against plain-Jane BM25.

Bert Staub

LexisNexis

Bert has been employed by LexisNexis for 20 years and has over 30 years of experience in IT. During his time at LexisNexis, he has worked to advance search applications to improve the accuracy of results for his customers. Bert has developed knowledge management systems for dictionaries, lexicons, and classification systems used in the search layer. He has also led named entity recognition projects for both search and content tagging. He has helped implement search systems based on mainframes, MapReduce, and cloud-based platforms. Currently, his applied research team is working to develop deep learning features to improve the quality of LexisNexis customers' search experience.